Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove foreign key constraints from MyISAM table #1232

Merged

Conversation

jtpalmer
Copy link
Contributor

Description

Removes foreign key constraints from the ETL table definition for modw_cloud.domain_submission_venues.

Motivation and Context

MyISAM tables do not support foreign key constraints. This results in the ETL system attempting to alter the table every time data is ingested:

2020-02-18 13:01:40 [debug] Discover table 'modw_cloud.domain_submission_venues'
2020-02-18 13:01:40 [notice] Altering table `modw_cloud`.`domain_submission_venues`
2020-02-18 13:01:40 [debug] Alter table SQL ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=localhost:3306, user=xdmod):
ALTER TABLE `modw_cloud`.`domain_submission_venues`
ADD CONSTRAINT `fk_dsv_domain_id` FOREIGN KEY (`domain_id`) REFERENCES `domains` (`id`) ON DELETE CASCADE,
ADD CONSTRAINT `fk_dsv_submission_venue_id` FOREIGN KEY (`submission_venue_id`) REFERENCES `submission_venue` (`submission_venue_id`);

Tests performed

No tests.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project as found in the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@jtpalmer jtpalmer added bug Bugfixes Category:ETL Extract Transform Load labels Feb 18, 2020
@jtpalmer jtpalmer added this to the 9.0.0 milestone Feb 18, 2020
@sonarcloud
Copy link

sonarcloud bot commented Feb 18, 2020

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities (and Security Hotspot 0 Security Hotspots to review)
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@jtpalmer
Copy link
Contributor Author

Alternatively, we could consider changing all the tables in modw_cloud to be InnoDB.

@ryanrath
Copy link
Contributor

My preference would be to change the default to InnoDB, but if that doesn't work then we could just update this table to be InnoDB or update the FK check so that it only runs on InnoDB tables.

@jtpalmer
Copy link
Contributor Author

I'm not sure which default you're referring to, but this table is explicitly set to MyISAM. Unfortunately, the ETL system doesn't handle changes to the storage engine very well:

2020-02-18 15:45:26 [debug] Discover table 'modw_cloud.domain_submission_venues'
2020-02-18 15:45:26 [notice] Altering table `modw_cloud`.`domain_submission_venues`
2020-02-18 15:45:26 [debug] Alter table SQL ('Cloud DB', class=ETL\DataEndpoint\Mysql, config=datawarehouse, schema=modw_cloud, host=localhost:3306, user=xdmod):
ALTER TABLE `modw_cloud`.`domain_submission_venues`
ENGINE = innodb,
ADD CONSTRAINT `fk_dsv_domain_id` FOREIGN KEY (`domain_id`) REFERENCES `domains` (`id`) ON DELETE CASCADE,
ADD CONSTRAINT `fk_dsv_submission_venue_id` FOREIGN KEY (`submission_venue_id`) REFERENCES `submission_venue` (`submission_venue_id`);
2020-02-18 15:45:26 [error] xdmod.jobs-cloud-common.CloudTableManagement (ETL\Maintenance\ManageTables): Error managing ETL table `modw_cloud`.`domain_submission_venues`: SQLSTATE[HY000]: General error: 1005 Can't create table 'modw_cloud.#sql-3e4f_1685' (errno: 150)
2020-02-18 15:45:26 [warning] Stopping ETL due to exception in xdmod.jobs-cloud-common.CloudTableManagement (ETL\Maintenance\ManageTables)
2020-02-18 15:45:26 [info] Releasing lock file '/tmp/etlv2_20202'
2020-02-18 15:45:26 [critical] Ingestion failed: xdmod.jobs-cloud-common.CloudTableManagement (ETL\Maintenance\ManageTables): Error managing ETL table `modw_cloud`.`domain_submission_venues`: SQLSTATE[HY000]: General error: 1005 Can't create table 'modw_cloud.#sql-3e4f_1685' (errno: 150) (stacktrace: #0 /usr/share/xdmod/classes/ETL/Maintenance/ManageTables.php(152): CCR\Loggable->logAndThrowException('Error managing ...')
#1 /usr/share/xdmod/classes/ETL/EtlOverseer.php(473): ETL\Maintenance\ManageTables->execute(Object(ETL\EtlOverseerOptions))
#2 /usr/share/xdmod/classes/ETL/EtlOverseer.php(435): ETL\EtlOverseer->_execute('xdmod.jobs-clou...', Object(ETL\Maintenance\ManageTables))
#3 /usr/share/xdmod/classes/ETL/Utilities.php(279): ETL\EtlOverseer->execute(Object(ETL\Configuration\EtlConfiguration))
#4 /usr/share/xdmod/classes/OpenXdmod/DataWarehouseInitializer.php(269): ETL\Utilities::runEtlPipeline(Array, Object(Log_composite))
#5 /usr/share/xdmod/classes/OpenXdmod/DataWarehouseInitializer.php(140): OpenXdmod\DataWarehouseInitializer->ingestCloudDataGeneric()
#6 /usr/bin/xdmod-ingestor(257): OpenXdmod\DataWarehouseInitializer->ingestAll(NULL, NULL)
#7 /usr/bin/xdmod-ingestor(21): main()
#8 {main})

Disabling the check is a bad idea because it's misleading to have a configuration file with a foreign key constraint defined when no such constraint exists on the actual table. If anything I would prefer that the ETL system consider this configuration to be invalid since it's not supported by the underlying storage engine.

@ryanrath
Copy link
Contributor

wow, that's definitely my bad ( having this table set to 'MyISAM' ) it should definitely by InnoDB.

The "default" thing was just my brains way of interpreting your "Alternatively, we could consider changing all the tables in modw_cloud to be InnoDB." as in, the default table engine would be InnoDB for modw_cloud. Yeah... my brains weird shrug.

Oh, I wouldn't want to disable it wholesale, but only have it run for tables that can actually support keys. And I 100% agree that it would be ideal if the ETL System complained about this as being an invalid configuration.

So yeah, I think the easiest overall fix would be to just change this tables engine type to InnoDB as that's what it should have been in the first place hangs head in shame

@jtpalmer
Copy link
Contributor Author

jtpalmer commented Feb 19, 2020

I'm still looking into what needs to change to make the ETL system alter the table engine properly. Both modw.submission_venue and modw_cloud.domains will also need to be changed to InnoDB since they are referenced by the foreign key constraints. Also, the current foreign key constraint implementation doesn't support constraints to tables in a different schema (I didn't realize that submission_venue is not in modw_cloud), but that should be straight forward enough to add. I am concerned with the possible performance implications of changing both those tables since they are used in data warehouse queries.

This works:

ALTER TABLE `modw_cloud`.`domains` ENGINE=InnoDB;
ALTER TABLE `modw`.`submission_venue` ENGINE=InnoDB;
ALTER TABLE `modw_cloud`.`domain_submission_venues` ENGINE=InnoDB,
ADD CONSTRAINT `fk_dsv_domain_id` FOREIGN KEY (`domain_id`) REFERENCES `domains` (`id`) ON DELETE CASCADE,
ADD CONSTRAINT `fk_dsv_submission_venue_id` FOREIGN KEY (`submission_venue_id`) REFERENCES `modw`.`submission_venue` (`submission_venue_id`);

I'll leave this pull request up for now so that anyone else can comment if they'd like.

@jpwhite4
Copy link
Member

Anyone know why submission venue is in modw?

@jtpalmer
Copy link
Contributor Author

It looks like submission venue is used by XSEDE OSG jobs (and therefore was already in use before the cloud submission venue was added). I haven't verified that, but there is some ETL configuration that implies that that is ingested somewhere.

@jtpalmer
Copy link
Contributor Author

I'm merging this in now. I'll have more changes to try and support these constraints later.

@jtpalmer jtpalmer merged commit 31c2851 into ubccr:xdmod9.0 Mar 12, 2020
@jtpalmer jtpalmer deleted the fix-cloud-domain-submission-venues-table-def branch March 12, 2020 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugfixes Category:ETL Extract Transform Load
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants